AITopics | start state

e8da56eb93676e8f60ed2b696e44e7dc-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-30-2026, 03:54:24 GMT

The goal location is a small region around (20,20). In each task, S0 was a set of arm configurations establishing contact with the … end-effector, the 6-DoF change in stiffness, and 1-DoF gripper state. The fraction of start states in S0 that lead to success … IVF, classifier). The result of that execution is recorded as … Algorithm 1 is the pseudocode used for the experiments described in Section 4.1. Episodes last a maximum of 1000 steps.

artificial intelligence, initiation, termination region, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.97)

Autonomous Curriculum Design via Relative Entropy Based Task Modifications

Satici, Muhammed Yusuf, Wang, Jianxun, Roberts, David L.

arXiv.org Artificial Intelligence | Feb-28-2025

Curriculum learning is a training method in which an agent is first trained on a curriculum of relatively simple tasks related to a target task in an effort to shorten the time required to train on the target task. Autonomous curriculum design involves designing such curricula with no reliance on human knowledge or expertise. Finding an efficient and effective way of autonomously designing curricula remains an open problem. We propose a novel approach for automatically designing curricula by leveraging the learner's uncertainty to select curriculum tasks. Our approach measures the uncertainty in the learner's policy using relative entropy, and guides the agent to states of high uncertainty to facilitate learning. Our algorithm supports the generation of autonomous curricula in a self-assessed manner by leveraging the learner's past and current policies, but it also allows the use of teacher-guided design in an instructive setting. We provide theoretical guarantees for the convergence of our algorithm using two time-scale optimization processes. Results show that our algorithm outperforms randomly generated curricula, learning directly on the target task, and the curriculum-learning criteria existing in the literature. We also present two additional heuristic distance measures that could be combined with our relative-entropy approach for further performance improvements.
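The idea of scoring states by the relative entropy between a learner's current and past policies can be illustrated with a minimal sketch. This is not the authors' algorithm: the tabular policies, the state names, and the helper names `kl_divergence` and `most_uncertain_state` are all hypothetical, and the abstract does not say in which direction or over which states the divergence is computed.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Relative entropy D_KL(p || q) between two discrete action distributions.

    `eps` guards against division by zero; it is an assumption of this sketch.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def most_uncertain_state(current_policy, past_policy):
    """Return the state whose action distribution changed the most, plus all scores."""
    scores = {
        s: kl_divergence(current_policy[s], past_policy[s])
        for s in current_policy
    }
    return max(scores, key=scores.get), scores

# Hypothetical tabular policies: state -> probabilities over two actions.
past = {"s0": [0.5, 0.5], "s1": [0.5, 0.5], "s2": [0.9, 0.1]}
current = {"s0": [0.5, 0.5], "s1": [0.9, 0.1], "s2": [0.9, 0.1]}

state, scores = most_uncertain_state(current, past)
print(state)  # s1: the only state where the policy changed
```

A curriculum designer built on this score would then start training episodes from, or steer the agent toward, the highest-scoring states.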

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2502.21166

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > North Carolina > Wake County > Raleigh (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Industry: Education > Curriculum (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Layered LA-MAPF: a decomposition of large agent MAPF instance to accelerate solving without compromising solvability

Yao, Zhuo

arXiv.org Artificial Intelligence | Oct-22-2024

Multi-Agent Path Finding (MAPF) has been widely studied in recent years. However, most existing MAPF algorithms assume that an agent occupies only a single grid in a grid-based map. This assumption limits their applicability in many real-world domains where agents have geometric shapes, rather than being point-like. Such agents, which can occupy multiple cells simultaneously, are referred to as "large" agents. When considering the shape and size of agents in MAPF, the computational complexity increases significantly as the number of agents grows, primarily due to the increased overhead in conflict detection between geometric agents. In this paper, we propose two types of subproblems for the LA-MAPF (Large-Agent MAPF) problem: cluster (which has no constraints on the order of solution) and level (which imposes constraints on the solution order). We introduce Layered LA-MAPF, a method that decomposes a MAPF instance involving geometric agents into clusters, and then further decomposes each cluster into levels. This approach aims to reduce time complexity when solving LA-MAPF problems. Our results demonstrate the performance of our method as the number of agents increases across various maps, and how it accelerates LA-MAPF methods, such as LA-CBS and LA-LaCAM. Experiments show that our LA-MAPF method with instance decomposition halves the time cost (reducing from an average of 40s to 20s) and triples the success rate (from an average of 0.27 to 0.80) in finding a solution within 60 seconds. To facilitate further research, we have made the source code for Layered LA-MAPF publicly available at https://github.com/JoeYao-bit/LayeredMAPF/algorithm/LA-MAPF.

agent, artificial intelligence, decomposition, (16 more...)

arXiv.org Artificial Intelligence

2410.1716

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.56)

Causal Contextual Bandits with Adaptive Context

Madhavan, Rahul, Maiti, Aurghya, Sinha, Gaurav, Barman, Siddharth

arXiv.org Artificial Intelligence | Jun-2-2024

We study a variant of causal contextual bandits where the context is chosen based on an initial intervention chosen by the learner. At the beginning of each round, the learner selects an initial action, depending on which a stochastic context is revealed by the environment. Following this, the learner then selects a final action and receives a reward. Given $T$ rounds of interactions with the environment, the objective of the learner is to learn a policy (of selecting the initial and the final action) with maximum expected reward. In this paper we study the specific situation where every action corresponds to intervening on a node in some known causal graph. We extend prior work from the deterministic context setting to obtain simple regret minimization guarantees. This is achieved through an instance-dependent causal parameter, $\lambda$, which characterizes our upper bound. Furthermore, we prove that our simple regret is essentially tight for a large class of instances. A key feature of our work is that we use convex optimization to address the bandit exploration problem. We also conduct experiments to validate our theoretical results, and release our code at our project GitHub repository: https://github.com/adaptiveContextualCausalBandits/aCCB.

algorithm, conference paper, intervention, (16 more...)

arXiv.org Artificial Intelligence

2405.18626

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > India > Karnataka > Bengaluru (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
Europe > Portugal > Porto > Porto (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.47)

Model-based Offline Quantum Reinforcement Learning

Eisenmann, Simon, Hein, Daniel, Udluft, Steffen, Runkler, Thomas A.

arXiv.org Artificial Intelligence | Apr-14-2024

This paper presents the first algorithm for model-based offline quantum reinforcement learning and demonstrates its functionality on the cart-pole benchmark. The model and the policy to be optimized are each implemented as variational quantum circuits. The model is trained by gradient descent to fit a pre-recorded data set. The policy is optimized with a gradient-free optimization scheme using the return estimate given by the model as the fitness function. This model-based approach allows, in principle, full realization on a quantum computer during the optimization phase and gives hope that a quantum advantage can be achieved as soon as sufficiently powerful quantum computers are available.

reinforcement, reinforcement learning, surrogate model, (17 more...)

arXiv.org Artificial Intelligence

2404.10017

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

A* Lasso for Learning a Sparse Bayesian Network Structure for Continuous Variables

Neural Information Processing Systems | Mar-13-2024, 18:23:11 GMT

We address the problem of learning a sparse Bayesian network structure for continuous variables in a high-dimensional space. The constraint that the estimated Bayesian network structure must be a directed acyclic graph (DAG) makes the problem challenging because of the huge search space of network structures. Most previous methods were based on a two-stage approach that prunes the search space in the first stage and then searches for a network structure satisfying the DAG constraint in the second stage. Although this approach is effective in a low-dimensional setting, it is difficult to ensure that the correct network structure is not pruned in the first stage in a high-dimensional setting. In this paper, we propose a single-stage method, called A* lasso, that recovers the optimal sparse Bayesian network structure by solving a single optimization problem with an A* search algorithm that uses lasso in its scoring system. Our approach substantially improves the computational efficiency of the well-known exact methods based on dynamic programming. We also present a heuristic scheme that further improves the efficiency of A* lasso without significantly compromising the quality of solutions. We demonstrate our approach on data simulated from benchmark Bayesian networks and real data.

lasso, network structure, qlim, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Asia (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Safe Mission-Level Path Planning for Exploration of Lunar Shadowed Regions by a Solar-Powered Rover

Lamarre, Olivier, Malhotra, Shantanu, Kelly, Jonathan

arXiv.org Artificial Intelligence | Jan-16-2024

Exploration of the lunar south pole with a solar-powered rover is challenging due to the highly dynamic solar illumination conditions and the presence of permanently shadowed regions (PSRs). In turn, careful planning in space and time is essential. Mission-level path planning is a global, spatiotemporal paradigm that addresses this challenge, taking into account rover resources and mission requirements. However, existing approaches do not proactively account for random disturbances, such as recurring faults, that may temporarily delay rover traverse progress. In this paper, we formulate a chance-constrained mission-level planning problem for the exploration of PSRs by a solar-powered rover affected by random faults. The objective is to find a policy that visits as many waypoints of scientific interest as possible while respecting an upper bound on the probability of mission failure. Our approach assumes that faults occur randomly, but at a known, constant average rate. Each fault is resolved within a fixed time, simulating the recovery period of an autonomous system or the time required for a team of human operators to intervene. Unlike solutions based upon dynamic programming alone, our method breaks the chance-constrained optimization problem into smaller offline and online subtasks to make the problem computationally tractable. Specifically, our solution combines existing mission-level path planning techniques with a stochastic reachability analysis component. We find mission plans that remain within reach of safety throughout large state spaces. To empirically validate our algorithm, we simulate mission scenarios using orbital terrain and illumination maps of Cabeus Crater. Results from simulations of multi-day, long-range drives in the LCROSS impact region are also presented.

policy tree, rover, waypoint, (17 more...)

arXiv.org Artificial Intelligence

2401.08558

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > Canada > Quebec > Montreal (0.14)
(6 more...)

Genre: Research Report (0.82)

Industry:

Energy > Renewable > Solar (1.00)
Government > Space Agency (0.94)
Government > Regional Government > North America Government > United States Government (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Recovery Policies for Safe Exploration of Lunar Permanently Shadowed Regions by a Solar-Powered Rover

Lamarre, Olivier, Malhotra, Shantanu, Kelly, Jonathan

arXiv.org Artificial Intelligence | Oct-21-2023

The success of a multi-kilometre drive by a solar-powered rover at the lunar south pole depends upon careful planning in space and time due to highly dynamic solar illumination conditions. An additional challenge is that the rover may be subject to random faults that can temporarily delay long-range traverses. The majority of existing global spatiotemporal planners assume a deterministic rover-environment model and do not account for random faults. In this paper, we consider a random fault profile with a known, average spatial fault rate. We introduce a methodology to compute recovery policies that maximize the probability of survival of a solar-powered rover from different start states. A recovery policy defines a set of recourse actions to reach a safe location with sufficient battery energy remaining, given the local solar illumination conditions. We solve a stochastic reach-avoid problem using dynamic programming to find an optimal recovery policy. Our focus, in part, is on the implications of state space discretization, which is required in practical implementations. We propose a modified dynamic programming algorithm that conservatively accounts for approximation errors. To demonstrate the benefits of our approach, we compare against existing methods in scenarios where a solar-powered rover seeks to safely exit from permanently shadowed regions in the Cabeus area at the lunar south pole. We also highlight the relevance of our methodology for mission formulation and trade safety analysis by comparing different rover mobility models in simulated recovery drives from the LCROSS impact region.
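A stochastic reach-avoid problem solved by dynamic programming can be sketched on a toy example. The code below is a generic finite-horizon backup, not the paper's modified algorithm: the one-dimensional chain of states, the fault model (a move fails and the agent stays put with probability `1 - p_move`), and every name in it are assumptions made purely for illustration.

```python
def reach_avoid_values(n_states, target, unsafe, horizon, p_move=0.8):
    """Max probability of reaching `target` within `horizon` steps while avoiding `unsafe`.

    States are 0..n_states-1 on a line; actions are -1 (left) and +1 (right).
    """

    def transition(s, a):
        # Intended move succeeds with p_move; otherwise a "fault" keeps the agent in place.
        nxt = min(max(s + a, 0), n_states - 1)
        if nxt == s:
            return {s: 1.0}
        return {nxt: p_move, s: 1.0 - p_move}

    def q_value(values, s, a):
        return sum(p * values[t] for t, p in transition(s, a).items())

    # Terminal condition: success probability is 1 inside the target, 0 elsewhere.
    values = [1.0 if s in target else 0.0 for s in range(n_states)]
    policy = [None] * n_states
    for _ in range(horizon):
        new_values = list(values)
        for s in range(n_states):
            if s in target or s in unsafe:
                continue  # absorbing: value stays 1 (target) or 0 (unsafe)
            best = max((-1, +1), key=lambda a: q_value(values, s, a))
            policy[s] = best
            new_values[s] = q_value(values, s, best)
        values = new_values
    return values, policy

values, policy = reach_avoid_values(5, target={4}, unsafe={0}, horizon=2)
print(round(values[2], 2), policy[2])
```

From the middle state, reaching the target in two steps requires two fault-free moves to the right, so the computed value there is `p_move ** 2`. A recovery policy in the paper's sense would be read off `policy` in the same way, though over a far richer state space (position, time, battery energy).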

discretization resolution, rover, state space, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.actaastro.2023.09.028

2307.16786

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Energy > Renewable > Solar (1.00)
Government > Space Agency (0.94)
Government > Regional Government > North America Government > United States Government (0.68)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.55)

Learning Search-Space Specific Heuristics Using Neural Networks

Liu, Yu, Kuroiwa, Ryo, Fukunaga, Alex

arXiv.org Artificial Intelligence | Jun-6-2023

We propose and evaluate a system which learns a neural network heuristic function for forward search-based, satisficing classical planning. Our system learns distance-to-goal estimators from scratch, given a single PDDL training instance. Training data is generated by backward regression search or by backward search from given or guessed goal states. In domains such as the 24-puzzle where all instances share the same search space, such heuristics can also be reused across all instances in the domain. We show that this relatively simple system can perform surprisingly well, sometimes competitive with well-known domain-independent heuristics.
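Generating distance-to-goal training data by searching backward from a goal state can be sketched as a breadth-first search over predecessors. This is only an illustration under simplifying assumptions: it uses an explicit toy graph rather than PDDL regression, and the function name `backward_distances`, the `max_depth` parameter, and the example graph are hypothetical.

```python
from collections import deque

def backward_distances(goal, predecessors, max_depth=None):
    """Breadth-first search backward from `goal`.

    `predecessors(s)` returns the states that can reach `s` in one step.
    Returns a dict mapping each discovered state to its distance to the goal.
    """
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        s = queue.popleft()
        if max_depth is not None and dist[s] >= max_depth:
            continue  # do not expand beyond the depth limit
        for p in predecessors(s):
            if p not in dist:
                dist[p] = dist[s] + 1
                queue.append(p)
    return dist

# Toy forward transition graph (hypothetical): state -> successor states.
edges = {"a": ["b", "c"], "b": ["c"], "c": [], "d": ["a"]}

# Invert the edges so the search can walk from the goal toward start states.
reverse = {s: [] for s in edges}
for src, dsts in edges.items():
    for dst in dsts:
        reverse[dst].append(src)

# (state, distance-to-goal) pairs that could serve as regression targets.
training_pairs = sorted(backward_distances("c", lambda s: reverse[s]).items())
print(training_pairs)  # [('a', 1), ('b', 1), ('c', 0), ('d', 2)]
```

Each pair could then serve as a supervised example for a learned heuristic, with the state as input and the distance as the regression target.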

artificial intelligence, backward search, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2306.04019

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > Germany > Saarland (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Indexability is Not Enough for Whittle: Improved, Near-Optimal Algorithms for Restless Bandits

Ghosh, Abheek, Nagaraj, Dheeraj, Jain, Manish, Tambe, Milind

arXiv.org Artificial Intelligence | Feb-28-2023

We study the problem of planning restless multi-armed bandits (RMABs) with multiple actions. This is a popular model for multi-agent systems with applications like multi-channel communication, monitoring and machine maintenance tasks, and healthcare. Whittle index policies, which are based on Lagrangian relaxations, are widely used in these settings due to their simplicity and near-optimality under certain conditions. In this work, we first show that Whittle index policies can fail in simple and practically relevant RMAB settings, even when the RMABs are indexable. We discuss why the optimality guarantees fail and why asymptotic optimality may not translate well to practically relevant planning horizons. We then propose an alternate planning algorithm based on the mean-field method, which can provably and efficiently obtain near-optimal policies with a large number of arms, without the stringent structural assumptions required by the Whittle index policies. This borrows ideas from existing research with some improvements: our approach is hyper-parameter free, and we provide an improved non-asymptotic analysis which has: (a) no requirement for exogenous hyper-parameters and tighter polynomial dependence on known problem parameters; (b) high probability bounds which show that the reward of the policy is reliable; and (c) matching sub-optimality lower bounds for this algorithm with respect to the number of arms, thus demonstrating the tightness of our bounds. Our extensive experimental analysis shows that the mean-field approach matches or outperforms other baselines.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2211.00112

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > India (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Public Health (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
